Abstract: We investigate the limits of communication over the 
discrete-time Additive White Gaussian Noise (AWGN) channel, 
when the channel output is quantized using a small number 
of bits. We first provide a proof of our recent conjecture on 
the optimality of a discrete input distribution in this scenario. 
Specifically, we show that for any given output quantizer choice 
with K quantization bins (i.e., a precision of log_2 K bits), the 
input distribution, under an average power constraint, need not 
have any more than K + 1 mass points to achieve the channel 
capacity. The cutting-plane algorithm is employed to compute 
this capacity and to generate optimum input distributions. 
Numerical optimization over the choice of the quantizer is then 
performed (for 2-bit and 3-bit symmetric quantization), and 
the results we obtain show that the loss due to low-precision 
output quantization, which is small at low signal-to-noise ratio 
(SNR) as expected, can be quite acceptable even for moderate 
to high SNR values. For example, at SNRs up to 20 dB, 2-3 bit 
quantization achieves 80-90% of the capacity achievable using 
infinite-precision quantization. 

I. Introduction 

Analog-to-digital conversion (ADC) is an integral part of 
modern communication receiver architectures based on digital 
signal processing (DSP). Typically, ADCs with 6-12 bits of 
precision are employed at the receiver to convert the received 
analog baseband signal into digital form for further processing. 
However, as communication systems scale up in speed and 
bandwidth (e.g., systems operating in the ultrawide band or 
the mm-wave band), the cost and power consumption of such 
high-precision ADC becomes prohibitive [1]. A DSP-centric 
architecture nonetheless remains attractive, due to the continu- 
ing exponential advances in digital electronics (Moore's law). 
It is of interest, therefore, to understand whether DSP-centric 
design is compatible with the use of low-precision ADC. 

Fig. 1. The AWGN-Quantized Output channel: Y = Q(X + N).

In this paper, we continue our investigation of the Shannon-theoretic 
communication limits imposed by the use of low-precision ADC for 
ideal Nyquist sampled linear modulation in AWGN. The discrete-time 
memoryless AWGN-Quantized Output (AWGN-QO) channel model thus 
induced is shown in Fig. 1. In our prior work for this channel 
model, we have shown that for the extreme scenario of 1-bit 
symmetric quantization, binary antipodal signaling achieves the 
channel capacity for any signal-to-noise ratio (SNR) [2]. For 
multi-bit quantization [3], we provided a duality-based approach to 
bound the capacity from above, and employed the cutting- 
plane algorithm to generate input distributions that nearly 
achieved these upper bounds. Based on our results, we con- 
jectured that a discrete input with cardinality not exceeding 
the number of quantization bins achieves the capacity of the 
average power constrained AWGN-QO channel. In this work, 
we prove that a discrete input is indeed optimal, although our 
result only guarantees its cardinality to be at most K + 1, 
where K is the number of quantization bins. Our proof 
is inspired by Witsenhausen's result in [4], where Dubins' 
theorem [5] was used to show that the capacity of a discrete- 
time memoryless channel with output cardinality K, under 
only a peak power constraint is achievable by a discrete input 
with at most K points. The key to our proof is to show 
that, under output quantization, an average power constraint 
automatically induces a peak power constraint, after which we 
use Dubins' theorem as done by Witsenhausen. Although not 
applicable to our setting, it is worth noting that for a Discrete 
Memoryless Channel, Gallager first showed that the number 
of inputs with nonzero probability mass need not exceed the 
number of outputs [6, p. 96, Corollary 3]. 

While the preceding results optimize the input distribution 
for a fixed quantizer, comparison with an unquantized system 
requires optimization over the choice of the quantizer as 
well. We do this numerically for 2-bit and 3-bit symmetric 
quantization, and use our numerical results to make the 
following encouraging observations: (a) Low-precision ADC 
incurs a relatively small loss in spectral efficiency compared 
to unquantized observations. While this is expected for low 
SNRs, we find that even at moderately high SNRs of up to 
20 dB, 2-3 bit ADC still achieves 80-90% of the spectral effi- 
ciency attained using unquantized observations. These results 
indicate the feasibility of system design using low-precision 
ADC for high bandwidth systems, (b) Standard uniform Pulse 
Amplitude Modulated (PAM) input with quantizer thresholds 



set to implement maximum likelihood (ML) hard decisions 
achieves nearly the same performance as that attained by an 
optimal input and quantizer pair. This is useful from a system 
designer's point of view, since the ML quantizer thresholds 
have a simple analytical dependence on SNR, which is an 
easily measurable quantity. 

The rest of the paper is organized as follows. The quantized 
output AWGN channel model is given in the next section. In 
Section III, we show that a discrete input achieves the capacity 
of this channel. Quantizer optimization results are presented 
in Section IV, followed by the conclusions in Section V. 

II. Channel Model 

We consider linear modulation over a real AWGN channel, 
and assume that the Nyquist criterion for no intersymbol 
interference is satisfied [7, p. 50]. Symbol rate sampling of the 
receiver's matched filter output using a finite-precision ADC 
therefore results in the following discrete-time memoryless 
AWGN-Quantized Output (AWGN-QO) channel (Fig. 1) 



$$ Y = Q(X + N). \tag{1} $$

Here $X \in \mathbb{R}$ is the channel input with distribution $F(x)$ and 
$N$ is $\mathcal{N}(0, \sigma^2)$. The quantizer $Q$ maps the real valued input 
$X + N$ to one of the $K$ bins, producing a discrete channel 
output $Y \in \{y_1, \ldots, y_K\}$. We only consider quantizers for 
which each bin is an interval of the real line. The quantizer 
$Q$ with $K$ bins can therefore be characterized by the set of 
its $(K - 1)$ thresholds $q = [q_1, q_2, \ldots, q_{K-1}] \in \mathbb{R}^{K-1}$, such 
that $-\infty := q_0 < q_1 < q_2 < \cdots < q_{K-1} < q_K := \infty$. The 
resulting transition probability functions are given by 



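$$ W_i(x) = P(Y = y_i \mid X = x) = Q\left(\frac{q_{i-1} - x}{\sigma}\right) - Q\left(\frac{q_i - x}{\sigma}\right), \tag{2} $$

where $Q(x)$ denotes the complementary Gaussian distribution 
function $\frac{1}{\sqrt{2\pi}} \int_x^{\infty} \exp(-t^2/2)\, dt$. 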
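
As an illustration of (2), the transition probabilities can be evaluated numerically. The following is a minimal Python sketch and is not taken from the paper: the names `q_func` and `transition_probs`, and the use of `math.erfc` to evaluate the complementary Gaussian distribution function, are assumptions made for illustration.

```python
import math

def q_func(x):
    """Complementary Gaussian distribution function Q(x), via erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def transition_probs(x, thresholds, sigma=1.0):
    """Return [W_1(x), ..., W_K(x)] for a quantizer with K-1 finite thresholds.

    The thresholds must be sorted in increasing order; the outer bin
    edges q_0 = -inf and q_K = +inf are added here.
    """
    edges = [-math.inf] + list(thresholds) + [math.inf]
    return [q_func((lo - x) / sigma) - q_func((hi - x) / sigma)
            for lo, hi in zip(edges[:-1], edges[1:])]

if __name__ == "__main__":
    # 2-bit symmetric quantizer with thresholds {-1, 0, 1}, input x = 0.3.
    print(transition_probs(0.3, [-1.0, 0.0, 1.0]))
```

For any x, the K values returned sum to one, since the bins partition the real line.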

The input-output mutual information I(X;Y), expressed 
explicitly as a function of F, is 

$$ I(F) = \int_{-\infty}^{\infty} \sum_{i=1}^{K} W_i(x) \log \frac{W_i(x)}{R(y_i; F)} \, dF(x), \tag{3} $$

where $\{R(y_i; F), 1 \le i \le K\}$ is the Probability Mass 
Function (PMF) of the output when the input is F. Under 
an average power constraint P (i.e., $E[X^2] \le P$), we wish to 
compute the capacity of the channel (1), which is given by 

$$ C = \sup_{F \in \mathcal{F}} I(F), \tag{4} $$

where $\mathcal{F}$ is the set of all average power constrained 
distributions on $\mathbb{R}$. 
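
For a discrete input, the integral in (3) reduces to a finite sum over the mass points. The sketch below evaluates it in Python under stated assumptions: the function names are hypothetical, the logarithm is taken to base 2 so that the result is in bits, and Q(x) is computed with `math.erfc`. It only evaluates I(F) for a given input and does not perform the maximization in (4).

```python
import math

def q_func(x):
    """Complementary Gaussian distribution function Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def transition_probs(x, thresholds, sigma=1.0):
    """[W_1(x), ..., W_K(x)] as in (2), with q_0 = -inf and q_K = +inf."""
    edges = [-math.inf] + list(thresholds) + [math.inf]
    return [q_func((lo - x) / sigma) - q_func((hi - x) / sigma)
            for lo, hi in zip(edges[:-1], edges[1:])]

def mutual_information(points, probs, thresholds, sigma=1.0):
    """I(F) in bits for a discrete input with the given mass points."""
    rows = [transition_probs(x, thresholds, sigma) for x in points]
    n_bins = len(thresholds) + 1
    # Output PMF: R(y_i; F) = sum over mass points of p * W_i(x).
    out_pmf = [sum(p * row[i] for p, row in zip(probs, rows))
               for i in range(n_bins)]
    total = 0.0
    for p, row in zip(probs, rows):
        for w, r in zip(row, out_pmf):
            if p > 0.0 and w > 0.0:  # terms with w = 0 contribute nothing
                total += p * w * math.log2(w / r)
    return total

if __name__ == "__main__":
    # Binary antipodal input at 0 dB SNR with a 1-bit quantizer (threshold 0).
    print(mutual_information([-1.0, 1.0], [0.5, 0.5], [0.0]))
```

The average power of such an input is simply the probability-weighted sum of the squared mass points, which must not exceed P.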

III. Discrete Input Achieves Capacity 

We first use the Karush-Kuhn-Tucker (KKT) optimality 
condition to show that an average power constraint for the 
AWGN-QO channel automatically induces a constraint on the 
peak power, in the sense that an optimal input distribution 
must have a bounded support set. This fact is then exploited 
to show the optimality of a discrete input. 



A. An Implicit Peak Power Constraint 

The following KKT condition can be derived for the 
AWGN-QO channel, using convex optimization principles in 
a manner similar to that in [8], [9]. The input distribution F 
is optimal if and only if there exists a $\gamma \ge 0$ such that 

$$ \sum_{i=1}^{K} W_i(x) \log \frac{W_i(x)}{R(y_i; F)} + \gamma (P - x^2) \le I(F) \tag{5} $$

for all x, with equality if x is in the support of F. 

The first term on the left hand side of the KKT condition 
(5) is the divergence (or the relative entropy) between the 
transition and the output PMFs. For convenience, let us denote 
it by d(x; F). The following result concerning the behavior of 
d(x; F) has been proved in [10]. 

Lemma 1: For the AWGN-QO channel (1) with input distribution 
F, the divergence function d(x; F) satisfies the 
following properties: 

(a) $\lim_{x \to \infty} d(x; F) = -\log R(y_K; F)$. 

(b) There exists a finite constant $A_0$ such that for all $x > A_0$, 
$d(x; F) < -\log R(y_K; F)$. 

Proof: See [10]. 

We now use Lemma 1 to prove the main result of this 
subsection. 
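
The behavior described in Lemma 1 can be observed numerically for a specific example. The sketch below is an illustration only, not a proof and not the authors' code: it assumes a binary antipodal input at ±1 with equal probabilities, a 2-bit quantizer with thresholds {-1, 0, 1}, unit noise variance, and natural logarithms. The function names are hypothetical. It reports the gap between the limit -log R(y_K; F) and d(x; F) for a few moderate values of x (for very large x, floating-point rounding makes the gap indistinguishable from zero).

```python
import math

def q_func(x):
    """Complementary Gaussian distribution function Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def transition_probs(x, thresholds, sigma=1.0):
    """[W_1(x), ..., W_K(x)] as in (2)."""
    edges = [-math.inf] + list(thresholds) + [math.inf]
    return [q_func((lo - x) / sigma) - q_func((hi - x) / sigma)
            for lo, hi in zip(edges[:-1], edges[1:])]

def output_pmf(points, probs, thresholds, sigma=1.0):
    """R(y_i; F) for a discrete input F."""
    rows = [transition_probs(a, thresholds, sigma) for a in points]
    return [sum(p * row[i] for p, row in zip(probs, rows))
            for i in range(len(thresholds) + 1)]

def divergence(x, points, probs, thresholds, sigma=1.0):
    """d(x; F): divergence between W(.|x) and the output PMF (in nats)."""
    w = transition_probs(x, thresholds, sigma)
    r = output_pmf(points, probs, thresholds, sigma)
    return sum(wi * math.log(wi / ri) for wi, ri in zip(w, r) if wi > 0.0)

def lemma1_gaps(xs, points, probs, thresholds, sigma=1.0):
    """Return -log R(y_K; F) - d(x; F) for each x in xs."""
    limit = -math.log(output_pmf(points, probs, thresholds, sigma)[-1])
    return [limit - divergence(x, points, probs, thresholds, sigma) for x in xs]

if __name__ == "__main__":
    print(lemma1_gaps([2.0, 3.0, 4.0, 5.0],
                      [-1.0, 1.0], [0.5, 0.5], [-1.0, 0.0, 1.0]))
```

In this example the gaps are positive and shrink as x grows, in line with parts (a) and (b) of the lemma.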

Proposition 1: A capacity-achieving input distribution for 
the average power constrained AWGN-QO channel (1) must 
have bounded support. 

Proof: Assume that the input distribution F* achieves¹ the 
capacity in (4) (i.e., I(F*) = C), with $\gamma^* \ge 0$ being 
a corresponding optimal Lagrange parameter in the KKT 
condition. In other words, with $\gamma = \gamma^*$ and F = F*, (5) must 
be satisfied with an equality at every point in the support of 
F*. We exploit this necessary condition next to show that the 
support of F* is upper bounded. Specifically, we prove that 
there exists a finite constant $A_2^*$ such that it is not possible 
to attain equality in (5) for any $x > A_2^*$. 

Using Lemma 1, we first let 
$\lim_{x \to \infty} d(x; F^*) = -\log R(y_K; F^*) = L$, and also assume 
that there exists a finite constant $A_0$ such that for all $x > A_0$, 
$d(x; F^*) < L$. We consider two possible cases. 

Case 1: $\gamma^* > 0$. 

If $C > L + \gamma^* P$, then pick $A_2^* = A_0$. 

Else pick $A_2^* \ge \max\{A_0, \sqrt{(L + \gamma^* P - C)/\gamma^*}\}$. 

In either situation, for $x > A_2^*$, we get $d(x; F^*) < L$ 
and $\gamma^* x^2 > L + \gamma^* P - C$. 

This gives 

$$ d(x; F^*) + \gamma^* (P - x^2) < L + \gamma^* P - (L + \gamma^* P - C) = C. $$

Case 2: $\gamma^* = 0$. 

Putting $\gamma^* = 0$ in the KKT condition (5), we get 

$$ d(x; F^*) = \sum_{i=1}^{K} W_i(x) \log \frac{W_i(x)}{R(y_i; F^*)} \le C, \quad \forall x. $$

Thus, 

$$ L = \lim_{x \to \infty} d(x; F^*) \le C. $$

Picking $A_2^* = A_0$, we therefore have that for $x > A_2^*$, 

$$ d(x; F^*) + \gamma^* (P - x^2) = d(x; F^*) < L \implies d(x; F^*) + \gamma^* (P - x^2) < C. $$

Combining the two cases, we have shown that the support of 
the distribution F* has a finite upper bound $A_2^*$. Using similar 
arguments, it can easily be shown that the support of F* has 
a finite lower bound $A_1^*$ as well, which implies that F* has 
a bounded support. ■ 

¹ That the capacity is achievable can be shown using standard results from 
optimization theory. For lack of space here, we refer the reader to [10] for 
details. 

B. Achievability of Capacity by a Discrete Input 

To show the optimality of a discrete input for our problem, 
we use the following theorem which we have proved in [10]. 
The theorem holds for channels with a finite output alphabet, 
under the condition that the input is constrained in both peak 
power and average power. 

Theorem 1: Consider a stationary discrete-time memoryless 
channel with a continuous input X taking values 
in the bounded interval $[A_1, A_2]$, and a discrete output 
$Y \in \{y_1, y_2, \ldots, y_K\}$. Let the transition probability function 
$W_i(x) = P(Y = y_i \mid X = x)$ be continuous in x, for each i 
in {1, ..., K}. The capacity of this channel, under an average 
power constraint on the input, is achievable by a discrete input 
with at most K + 1 points. 

Proof: See [10]. ■ 
Our proof in [10] uses Dubins' theorem [5], and is an 
extension of Witsenhausen's result in [4], wherein he showed 
that a distribution with only K points would be sufficient to 
achieve the capacity if the average power of the input was not 
constrained. 

The implicit peak power constraint derived in Section III-A 
allows us to use Theorem 1 to get the following result. 

Proposition 2: The capacity of the average power constrained 
AWGN-QO channel (1) is achievable by a discrete 
input distribution with at most K + 1 points of support. 

Proof: Using notation from the last subsection, let F* be an 
optimal distribution for (4), with the support of F* being 
contained in the bounded interval $[A_1^*, A_2^*]$. Define $\mathcal{F}_1$ to be 
the set of all average power constrained distributions whose 
support is contained in $[A_1^*, A_2^*]$. Note that $F^* \in \mathcal{F}_1 \subset \mathcal{F}$, 
where $\mathcal{F}$ is the set of all average power constrained 
distributions on $\mathbb{R}$. Consider the maximization of the mutual 
information I(X;Y) over the set $\mathcal{F}_1$: 

$$ C_1 = \max_{F \in \mathcal{F}_1} I(F). \tag{6} $$

Since the transition probability functions in (2) are continuous 
in x, Theorem 1 implies that a discrete distribution with at 
most K + 1 mass points achieves the maximum $C_1$ in (6). 
Denote such a distribution by $F_1$. However, since F* achieves 
the maximum C in (4) and $F^* \in \mathcal{F}_1$, it must also achieve the 
maximum in (6). This implies that $C_1 = C$, and that $F_1$ is 
optimal for (4), thus completing the proof. ■ 



C. Capacity Computation 

We have already addressed the issue of computing the 
capacity in our prior work. Specifically, in [2], we have 
shown analytically that for the extreme scenario of 1-bit 
symmetric quantization, binary antipodal signaling achieves 
the capacity (at any SNR). Multi-bit quantization has been 
considered in [3], [10], where we show that the cutting-plane 
algorithm [11] can be employed for computing the capacity 
and obtaining optimal input distributions. 
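
For the 1-bit case, the capacity can be written in closed form once binary antipodal signaling is known to be optimal: with the threshold at zero and inputs at ±√P, the channel is a binary symmetric channel with crossover probability Q(√SNR), so its capacity is 1 - h_b(Q(√SNR)) bits, where h_b is the binary entropy function. The Python sketch below evaluates this expression alongside the unquantized real AWGN capacity (1/2) log_2(1 + SNR). The function names are hypothetical, and the sketch does not implement the cutting-plane algorithm used for multi-bit quantization.

```python
import math

def q_func(x):
    """Complementary Gaussian distribution function Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def binary_entropy(p):
    """Binary entropy function h_b(p) in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def capacity_one_bit(snr_db):
    """Capacity (bits/channel use) with 1-bit symmetric quantization."""
    snr = 10.0 ** (snr_db / 10.0)
    return 1.0 - binary_entropy(q_func(math.sqrt(snr)))

def capacity_unquantized(snr_db):
    """Capacity (bits/channel use) of the real AWGN channel."""
    snr = 10.0 ** (snr_db / 10.0)
    return 0.5 * math.log2(1.0 + snr)

if __name__ == "__main__":
    for snr_db in (-5.0, 0.0, 5.0, 10.0):
        c1, cu = capacity_one_bit(snr_db), capacity_unquantized(snr_db)
        print(snr_db, round(c1, 4), round(cu, 4), round(c1 / cu, 3))
```

At -5 dB SNR, the ratio of the two capacities evaluates to roughly 0.68, consistent with the 68% figure quoted in Section IV-C.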

IV. Optimization Over Quantizer 

Until now, we have addressed the problem of capacity com- 
putation given a fixed quantizer. In this section, we consider 
the issue of quantizer optimization, while restricting attention 
to symmetric quantizers only. Given the symmetric nature of 
the AWGN noise and the power constraint, it seems intuitively 
plausible that restriction to symmetric quantizers should not 
be sub-optimal from the point of view of optimizing over the 
quantizer choice in (1), although a proof of this conjecture has 
eluded us. 

A Simple Benchmark: While an optimal quantizer (with a 
corresponding optimal input) provides the absolute commu- 
nication limits for our model, from a system designer's per- 
spective, it would also be useful to evaluate the performance 
degradation if we use some standard input constellations and 
quantizer choices. We take the following input and quantizer 
pair as our benchmark strategy: for K-bin quantization, 
consider an equispaced uniform K-PAM (Pulse Amplitude Modulated) 
input distribution, with quantizer thresholds at the 
mid-points of the input mass point locations (i.e., ML hard 
decisions). With the K-point uniform input, we have the 
entropy H(X) = log_2 K bits for any SNR. Also, it is easy 
to see that as SNR → ∞, H(X|Y) → 0 for the benchmark 
input-quantizer pair. Therefore, our benchmark scheme is 
near-optimal if we operate in the high SNR regime. The main issue 
to investigate ahead, therefore, is: at low to moderate SNRs, 
how much gain does an optimal quantizer choice provide over 
the benchmark? 
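
The benchmark input-quantizer pair is straightforward to construct and evaluate. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, mutual information is computed in bits, and the PAM spacing is chosen so that the average power equals P exactly, i.e., mass points at d(2k - K + 1) for k = 0, ..., K - 1 with d = sqrt(3P / (K^2 - 1)).

```python
import math

def q_func(x):
    """Complementary Gaussian distribution function Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def transition_probs(x, thresholds, sigma=1.0):
    """[W_1(x), ..., W_K(x)] as in (2)."""
    edges = [-math.inf] + list(thresholds) + [math.inf]
    return [q_func((lo - x) / sigma) - q_func((hi - x) / sigma)
            for lo, hi in zip(edges[:-1], edges[1:])]

def mutual_information(points, probs, thresholds, sigma=1.0):
    """I(F) in bits for a discrete input F."""
    rows = [transition_probs(x, thresholds, sigma) for x in points]
    out_pmf = [sum(p * row[i] for p, row in zip(probs, rows))
               for i in range(len(thresholds) + 1)]
    return sum(p * w * math.log2(w / r)
               for p, row in zip(probs, rows)
               for w, r in zip(row, out_pmf) if p > 0.0 and w > 0.0)

def pam_benchmark(K, power):
    """Uniform K-PAM mass points with average power `power`, and the
    mid-point (ML hard decision) thresholds between adjacent points."""
    d = math.sqrt(3.0 * power / (K * K - 1))
    points = [d * (2 * k - K + 1) for k in range(K)]
    thresholds = [d * (2 * k - K + 2) for k in range(K - 1)]
    return points, thresholds

def benchmark_mi(K, snr_db, sigma=1.0):
    """Mutual information (bits) of the benchmark scheme at a given SNR."""
    power = sigma ** 2 * 10.0 ** (snr_db / 10.0)
    points, thresholds = pam_benchmark(K, power)
    return mutual_information(points, [1.0 / K] * K, thresholds, sigma)

if __name__ == "__main__":
    for snr_db in (-10.0, 0.0, 10.0, 20.0):
        print(snr_db, round(benchmark_mi(4, snr_db), 4))
```

Because the thresholds scale with d, and d scales with the square root of P, the benchmark quantizer depends on the SNR only through a simple closed-form expression.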

In all the results that follow, we take the noise variance 
$\sigma^2 = 1$. However, the results are scale invariant in the sense 
that if both P and $\sigma^2$ are scaled by the same factor R (thus 
keeping the SNR unchanged), then there is an equivalent 
quantizer (obtained by scaling the thresholds by $\sqrt{R}$) that 
gives an identical performance. 
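
The scale invariance follows because the transition probabilities in (2) depend only on the ratios (q_i - x)/σ. A quick numerical check is sketched below in Python; the function names, the example input, and the choice R = 3 are arbitrary assumptions for illustration.

```python
import math

def q_func(x):
    """Complementary Gaussian distribution function Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def transition_probs(x, thresholds, sigma=1.0):
    """[W_1(x), ..., W_K(x)] as in (2)."""
    edges = [-math.inf] + list(thresholds) + [math.inf]
    return [q_func((lo - x) / sigma) - q_func((hi - x) / sigma)
            for lo, hi in zip(edges[:-1], edges[1:])]

def mutual_information(points, probs, thresholds, sigma=1.0):
    """I(F) in bits for a discrete input F."""
    rows = [transition_probs(x, thresholds, sigma) for x in points]
    out_pmf = [sum(p * row[i] for p, row in zip(probs, rows))
               for i in range(len(thresholds) + 1)]
    return sum(p * w * math.log2(w / r)
               for p, row in zip(probs, rows)
               for w, r in zip(row, out_pmf) if p > 0.0 and w > 0.0)

def scaled_pair(points, thresholds, sigma, R):
    """Scale power and noise variance by R: amplitudes scale by sqrt(R)."""
    s = math.sqrt(R)
    return [s * x for x in points], [s * q for q in thresholds], s * sigma

if __name__ == "__main__":
    points, probs = [-1.5, -0.5, 0.5, 1.5], [0.25] * 4
    thresholds, sigma = [-1.0, 0.0, 1.0], 1.0
    p2, t2, s2 = scaled_pair(points, thresholds, sigma, R=3.0)
    print(mutual_information(points, probs, thresholds, sigma),
          mutual_information(p2, probs, t2, s2))
```

The two printed values agree up to floating-point rounding.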

Numerical Results 

A. 2-bit Symmetric Quantization 

A 2-bit symmetric quantizer is characterized by a single 
parameter q, with {-q, 0, q} being the quantizer thresholds. 
Hence we use a brute force search over q to optimize the 
quantizer. In Fig. 2, we plot the variation of the channel 
capacity (computed using the cutting-plane algorithm) as a 
function of the parameter q at various SNRs. We observe that 
for any SNR, there is an optimal choice of q that maximizes 
the capacity. At high SNRs, the optimal q is seen to increase 
monotonically with SNR, which is not surprising since the 
benchmark quantizer's q scales as $\sqrt{\text{SNR}}$ and is known to be 
near-optimal at high SNRs. 

Fig. 2. 2-bit symmetric quantization: channel capacity versus the quantizer 
threshold q (noise variance $\sigma^2 = 1$). 

TABLE I 
Mutual information (in bits/channel use) at different SNRs. 
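
The structure of the brute force search over q can be sketched as follows. This Python example is a simplification and not the procedure used to produce Fig. 2: instead of computing the capacity for each q with the cutting-plane algorithm, it assumes a fixed uniform 4-PAM input and maximizes the resulting mutual information, which is only a lower bound on the capacity for that quantizer. The function names, the grid step, and the search range (0, 4d] are assumptions for illustration.

```python
import math

def q_func(x):
    """Complementary Gaussian distribution function Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def transition_probs(x, thresholds, sigma=1.0):
    """[W_1(x), ..., W_K(x)] as in (2)."""
    edges = [-math.inf] + list(thresholds) + [math.inf]
    return [q_func((lo - x) / sigma) - q_func((hi - x) / sigma)
            for lo, hi in zip(edges[:-1], edges[1:])]

def mutual_information(points, probs, thresholds, sigma=1.0):
    """I(F) in bits for a discrete input F."""
    rows = [transition_probs(x, thresholds, sigma) for x in points]
    out_pmf = [sum(p * row[i] for p, row in zip(probs, rows))
               for i in range(len(thresholds) + 1)]
    return sum(p * w * math.log2(w / r)
               for p, row in zip(probs, rows)
               for w, r in zip(row, out_pmf) if p > 0.0 and w > 0.0)

def search_threshold(snr_db, step=0.05):
    """Grid search over q for thresholds {-q, 0, q}, with a FIXED uniform
    4-PAM input (an assumption; not the capacity-achieving input).
    Noise variance is 1. Returns (best_q, best_mi, d)."""
    power = 10.0 ** (snr_db / 10.0)
    d = math.sqrt(power / 5.0)  # 4-PAM points at -3d, -d, d, 3d
    points, probs = [-3 * d, -d, d, 3 * d], [0.25] * 4
    best_q, best_mi = None, -1.0
    for k in range(1, int(4.0 * d / step) + 1):
        q = k * step
        mi = mutual_information(points, probs, [-q, 0.0, q])
        if mi > best_mi:
            best_q, best_mi = q, mi
    return best_q, best_mi, d

if __name__ == "__main__":
    q, mi, d = search_threshold(20.0)
    print("best q:", q, "ML mid-point 2d:", 2 * d, "MI (bits):", mi)
```

At high SNR, the maximizing q for this fixed input lies close to the mid-point 2d used by the benchmark quantizer.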

Comparison with the benchmark: In Table I, we compare 
the performance of the optimal solution obtained as above with 
the benchmark scheme. The capacity with 1-bit quantization 
is also shown for reference. While being near-optimal at 
moderate to high SNRs, the benchmark scheme is seen to 
perform fairly well at low SNRs also. For instance, at -10 
dB SNR, it achieves 86% of the capacity achieved with 
an optimal 2-bit quantizer and input pair. From a practical 
standpoint, these results imply that the benchmark scheme, 
which requires negligible computational effort (due to its 
well-defined dependence on SNR), can be employed even at small 
SNRs while incurring an acceptable loss of performance. 




Fig. 3. 2-bit symmetric quantization : optimal input distribution and quantizer 
at various SNRs (the dashed vertical lines depict the locations of the quantizer 
thresholds). 



TABLE II 

Mutual information (in bits/channel use) at different SNRs. 



Optimal Input Distributions: The optimal input distributions 
(given by the cutting-plane algorithm) corresponding to the 
optimal quantizers obtained above are depicted in Fig. 3 for 
different SNR values. The locations of the optimal quantizer 
thresholds are also shown (by the dashed vertical lines). Binary 
signaling is found to be optimal at low SNRs, and the number 
of mass points increases (first to 3 and then to 4) with 
increasing SNR. Further increase in SNR eventually leads to 
the uniform 4-PAM input, thus approaching the capacity bound 
of 2 bits. It is worth noting that all the optimal inputs we 
obtained have 4 or fewer mass points, whereas Proposition 2 is 
looser, as it guarantees the achievability of capacity using at 
most 5 points. 

B. 3-bit Symmetric Quantization 

For 3-bit symmetric quantization, we need to optimize over 
a space of 3 parameters: $\{0 < q_1 < q_2 < q_3\}$, with the 
quantizer thresholds being $\{0, \pm q_1, \pm q_2, \pm q_3\}$. Instead of brute 
force search, we use an alternate optimization procedure for 
joint optimization of the input and the quantizer in this case. 
Due to lack of space, we refer the reader to [10] for details, 
and proceed directly to the numerical results (Table II). 

Comparison with the benchmark: As for 2-bit quantization 
considered earlier, we find that the benchmark scheme 
performs quite well at low SNRs with 3-bit quantization also. At 
-10 dB SNR, for instance, the benchmark scheme achieves 
83% of the capacity achievable with an optimal quantizer 
choice. Table II gives the comparison for different SNRs. 

Optimal Input Distributions: Although not depicted here, we 
again observe (as for the 2-bit case) that the optimal inputs 
obtained all have at most K points (K = 8 in this case), while 
Proposition 2 guarantees the achievability of capacity by at 
most K + 1 points. Of course, Proposition 2 is applicable to any 
quantizer choice (and not just the optimal symmetric quantizers 
that we consider in this section). Nevertheless, it leaves us with the 
question of whether it can be tightened to guarantee achievability 
of capacity with at most K points. 

C. Comparison with Unquantized Observations 

We now compare the capacity results obtained above with 
the case when the receiver ADC has infinite precision. Table 
III provides these results, and the corresponding plots are 
shown in Fig. 4. We observe that at low SNRs, low-precision 
quantization is a very feasible option. For instance, at -5 dB 
SNR, even 1-bit receiver quantization achieves 68% of the 
capacity achievable with infinite precision. 2-bit quantization 
at the same SNR provides as much as 90% of the 
infinite-precision capacity. Such high figures are understandable, since 
if noise dominates the message signal, increasing the quantizer 
precision beyond a point does not help much in distinguishing 
between different signal levels. However, we surprisingly find 
that even if we consider moderate to high SNRs, the loss due to 
low-precision sampling is still very acceptable. At 10 dB SNR, 
for example, the corresponding ratio for 2-bit quantization 
is still a very high 85%, while at 20 dB, 3-bit quantization 
is enough to achieve 85% of the infinite-precision capacity. 
Similar encouraging results have been reported earlier in 
[12], [13] also. However, the input alphabet in these works 
was taken as binary to begin with, in which case the good 
performance with low-precision output quantization is perhaps 
less surprising. 

On the other hand, if we fix the spectral efficiency to that 
attained by an unquantized system at 10 dB (which is 1.73 
bits/channel use), we find that 2-bit quantization incurs a loss 
of 2.30 dB (see Table IV). From a practical viewpoint, this 
penalty in power is more significant compared to the 15% loss 
in spectral efficiency on using 2-bit quantization at 10 dB SNR. 
This suggests, for example, that the impact of low-precision 
ADC should be weathered by a moderate reduction in the spec- 
tral efficiency, rather than by increasing the transmit power. 
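
The unquantized entries of this comparison follow directly from the real AWGN capacity formula C = (1/2) log_2(1 + SNR), inverted to give the SNR required for a target spectral efficiency. A minimal Python sketch is given below; the function name is hypothetical, and the quantized entries cannot be obtained this way since they require numerical capacity computation.

```python
import math

def snr_db_required_unquantized(bits_per_use):
    """SNR (in dB) at which the real AWGN channel with unquantized
    output attains the given spectral efficiency (bits/channel use)."""
    snr = 2.0 ** (2.0 * bits_per_use) - 1.0
    return 10.0 * math.log10(snr)

if __name__ == "__main__":
    for c in (0.25, 0.5, 1.0, 1.73, 2.5):
        print(c, round(snr_db_required_unquantized(c), 2))
    # Power penalty of 2-bit quantization at 1.73 bits/channel use,
    # using the 12.30 dB figure reported for the 2-bit ADC.
    print(round(12.30 - snr_db_required_unquantized(1.73), 2))
```

The values obtained reproduce the unquantized row of Table IV to two decimal places.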





TABLE IV 
SNR (in dB) required for a given spectral efficiency. 

                Spectral Efficiency (bits per channel use) 
                 0.25      0.5      1.0     1.73      2.5 
1-bit ADC       -2.04     1.79        -        -        - 
2-bit ADC       -3.32     0.59     6.13    12.30        - 
3-bit ADC       -3.67     0.23     5.19    11.04    16.90 
Unquantized     -3.83     0.00     4.77    10.00    14.91 

V. Conclusions 

Our Shannon-theoretic investigation indicates the feasibility 
of low-precision ADC for designing future high-bandwidth 
communication systems such as those operating in the UWB and 
mm-wave bands. The small reduction in spectral efficiency due 
to low-precision ADC is acceptable in such systems, given 
that the available bandwidth is plentiful. Current research is 
therefore focused on developing ADC-constrained algorithms 
to perform receiver tasks such as carrier and timing 
synchronization, channel estimation and equalization. 

An unresolved technical issue concerns the number of mass 
points required to achieve capacity. While we have shown 
that the capacity for the AWGN channel with K-bin output 
quantization is achievable by a discrete input distribution with 
at most K + 1 points, numerical computation of optimal inputs 
reveals that K mass points are sufficient. Can this be proven 
analytically, at least for symmetric quantizers? Are symmetric 
quantizers optimal? Another problem for future investigation 
is whether our result regarding the optimality of a discrete 
input can be generalized to other channel models. Under what 
conditions is the capacity of an average power constrained 
channel with output cardinality K achievable by a discrete 
input with at most K + 1 points? 



